Overview

Dataset statistics

Number of variables: 15
Number of observations: 6523
Missing cells: 2570
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 MiB
Average record size in memory: 505.2 B
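
The percentage and size figures above follow from the raw counts. The sketch below re-derives them as a consistency check; it assumes that "missing cells (%)" is taken over variables × observations and that the total size is the average record size times the number of observations, which matches the reported values but is not stated in the report itself.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 6523
missing_cells = 2570
avg_record_bytes = 505.2

# Assumption: the missing percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size is the average record size times the row count.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"size in memory: {total_mib:.1f} MiB")
```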

Variable types

NUM: 10
CAT: 5

Warnings

name has a high cardinality: 6351 distinct values [High cardinality]
host_name has a high cardinality: 1902 distinct values [High cardinality]
neighbourhood has a high cardinality: 77 distinct values [High cardinality]
last_review has a high cardinality: 820 distinct values [High cardinality]
last_review has 1285 (19.7%) missing values [Missing]
reviews_per_month has 1285 (19.7%) missing values [Missing]
price is highly skewed (γ1 = 20.25399533) [Skewed]
name is uniformly distributed [Uniform]
id has unique values [Unique]
number_of_reviews has 1285 (19.7%) zeros [Zeros]
availability_365 has 1797 (27.5%) zeros [Zeros]
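
As an illustration of how warnings of this kind can be derived from per-column summaries, here is a minimal sketch. The thresholds and the `ColumnSummary` and `column_warnings` names are hypothetical: they are chosen to be consistent with the warnings listed above and are not read from the report's config.yaml. The "Uniform" warning is not covered.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds, picked so that the rules reproduce the warnings above.
CARDINALITY_THRESHOLD = 50   # categorical distinct values above this -> "High cardinality"
SKEWNESS_THRESHOLD = 20.0    # |skewness| above this -> "Skewed"
ZEROS_FRACTION = 0.05        # share of zeros above this -> "Zeros"


@dataclass
class ColumnSummary:
    name: str
    n_distinct: int
    n_missing: int = 0
    n_zeros: int = 0
    skewness: Optional[float] = None
    categorical: bool = False


def column_warnings(col: ColumnSummary, n_rows: int) -> List[str]:
    """Return the warning labels that apply to one column summary."""
    found = []
    if col.categorical and col.n_distinct > CARDINALITY_THRESHOLD:
        found.append("High cardinality")
    if col.n_missing > 0:
        found.append("Missing")
    if col.skewness is not None and abs(col.skewness) > SKEWNESS_THRESHOLD:
        found.append("Skewed")
    if col.n_distinct == n_rows:
        found.append("Unique")
    if col.n_zeros / n_rows > ZEROS_FRACTION:
        found.append("Zeros")
    return found


N_ROWS = 6523
columns = [
    ColumnSummary("neighbourhood", n_distinct=77, categorical=True),
    ColumnSummary("last_review", n_distinct=820, n_missing=1285, categorical=True),
    ColumnSummary("price", n_distinct=504, n_zeros=5, skewness=20.25399533),
    ColumnSummary("id", n_distinct=6523, skewness=-0.4930470015),
    ColumnSummary("number_of_reviews", n_distinct=338, n_zeros=1285, skewness=2.988450645),
]
for col in columns:
    print(col.name, column_warnings(col, N_ROWS))
```

With these assumed thresholds, price is flagged as skewed but not for its five zeros, which matches the list above.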

Reproduction

Analysis started: 2021-01-20 02:39:47.185658
Analysis finished: 2021-01-20 02:40:03.252345
Duration: 16.07 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
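
The reported duration can be checked against the two timestamps. This is a small sketch using only the values printed above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-01-20 02:39:47.185658", FMT)
finished = datetime.strptime("2021-01-20 02:40:03.252345", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints 16.07 seconds
```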

Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct: 6523
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29368180.91
Minimum: 2384
Maximum: 47141177
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 2384
5-th percentile: 4245243
Q1: 19478989
Median: 31990014
Q3: 41310540
95-th percentile: 46331194.7
Maximum: 47141177
Range: 47138793
Interquartile range (IQR): 21831551

Descriptive statistics

Standard deviation: 13434526.95
Coefficient of variation (CV): 0.4574517907
Kurtosis: -0.9110283935
Mean: 29368180.91
Median Absolute Deviation (MAD): 10450481
Skewness: -0.4930470015
Sum: 1.915686441e+11
Variance: 1.804865143e+14
Monotonicity: Strictly increasing
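
Several of the statistics above are derived from others: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, and mean = sum / count. The sketch below re-derives them from the id figures as a sanity check; the tolerances are an assumption that allows for rounding in the printed values.

```python
import math

# Values copied from the id summary above.
minimum, maximum = 2384, 47141177
q1, q3 = 19478989, 41310540
mean = 29368180.91
std = 13434526.95
total = 1.915686441e11
n = 6523

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
mean_from_sum = total / n         # Mean recomputed from Sum

print(value_range, iqr)
print(math.isclose(cv, 0.4574517907, rel_tol=1e-7))
print(math.isclose(variance, 1.804865143e14, rel_tol=1e-7))
print(math.isclose(mean_from_sum, mean, rel_tol=1e-7))
```

The same relations hold for the other numeric variables in this report.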
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
434216941< 0.1%
 
456698171< 0.1%
 
157300741< 0.1%
 
96106541< 0.1%
 
74088121< 0.1%
 
203790441< 0.1%
 
455080061< 0.1%
 
470542531< 0.1%
 
312259611< 0.1%
 
416209121< 0.1%
 
60860651< 0.1%
 
399640831< 0.1%
 
452376901< 0.1%
 
139729561< 0.1%
 
427205151< 0.1%
 
373713251< 0.1%
 
422721941< 0.1%
 
206055701< 0.1%
 
438348251< 0.1%
 
224496111< 0.1%
 
135326211< 0.1%
 
253434441< 0.1%
 
73702451< 0.1%
 
269593211< 0.1%
 
377071611< 0.1%
 
Other values (6498)649899.6%
 
ValueCountFrequency (%) 
23841< 0.1%
 
45051< 0.1%
 
71261< 0.1%
 
98111< 0.1%
 
106101< 0.1%
 
109451< 0.1%
 
121401< 0.1%
 
223621< 0.1%
 
248331< 0.1%
 
258791< 0.1%
 
ValueCountFrequency (%) 
471411771< 0.1%
 
471402451< 0.1%
 
471374451< 0.1%
 
471263611< 0.1%
 
471263071< 0.1%
 
471239441< 0.1%
 
471214221< 0.1%
 
471181551< 0.1%
 
471168941< 0.1%
 
471151401< 0.1%
 

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 6351
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 51.1 KiB

Value                                                                Count  Frequency (%)
Live + Work + Stay + Easy | 1BR in Chicago                           18     0.3%
UChicago, Shops + Eats, Lake | Gym + W&D | Zencity                   14     0.2%
Traveler's Dream - 1 bed in a shared bedroom                         10     0.2%
UChicago, Lake, Sci Museum | Gym + W&D | Zencity                     9      0.1%
Hotel Perks - Private Bedroom | Private Bathroom                     8      0.1%
Live + Work + Stay + Easy | 3BR in Chicago                           7      0.1%
Stylish & Ample 1PBR STAYCATION SPACE For You                        6      0.1%
Steps to MI Ave Shops | View, Beach, Gym | Zencity                   6      0.1%
Enjoy the Lakefront from a Cozy Retreat                              5      0.1%
Steps to Shop, Eat, Train | Easy Access | Zencity                    4      0.1%
Live + Work + Stay + Easy | 2BR in Chicago                           4      0.1%
A home you will love | 2BR in Chicago                                4      0.1%
Steps to UChicago | Easy Access + W&D | Zencity                      4      0.1%
Entire apartment for you | 2BR in Chicago                            3      < 0.1%
Steps to Shops, Eats | Easy Access + W&D | Zencity                   3      < 0.1%
BEST LOCATION EVER!WRIGLEY - BOYSTOWN - LAKEVIEW 1                   3      < 0.1%
SUPER EARLY CHECK IN AND SUPER LATE CHECK OUT                        3      < 0.1%
XL Penthouse"The Harper"Book 6 Nights Get 1 FREE                     3      < 0.1%
LAKEVIEW HEART! BOYSTOWN -WRIGLEY "HOSTEL STYLE" 2                   3      < 0.1%
All-inclusive apartment home | 3BR in Chicago                        3      < 0.1%
Gritty Chic River North + ACME Hotel                                 3      < 0.1%
Classic HP 1BR with Fast Transit to UChicago & DT by Zen Rentals     3      < 0.1%
Professionally maintained apt | 2BR in Chicago                       3      < 0.1%
5min to Wicker & DT | Lux Flat + W&D | Zencity                       3      < 0.1%
Bright Loop 1BR w/ Gym, Pool, nr. Financial District, by Blueground  3      < 0.1%
Other values (6326)                                                  6388   97.9%
 
Frequencies of value counts

Unique

Unique: 6264
Unique (%): 96.0%
Histogram of lengths of the category

Length

Max length: 206
Median length: 45
Mean length: 41.40870765
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 339
Unique unicode categories: 19
Unique unicode scripts: 7
Unique unicode blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
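
As a minimal sketch of this kind of analysis, Python's standard `unicodedata` module exposes the general category of each code point. The helper name `category_counts` is hypothetical, and the sample string is one of the listing names from this report. The two-letter codes map onto the category names used in the tables that follow (for example `Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Zs` = Space Separator, `Sm` = Math Symbol, `Nd` = Decimal Number).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category (e.g. 'Lu', 'Ll')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "Live + Work + Stay + Easy | 1BR in Chicago"
counts = category_counts(sample)
for category, count in counts.most_common():
    print(category, count)
```

Note that `+` and `|` both fall under Math Symbol, which is why that category ranks relatively high for listing names in this dataset.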

Most occurring characters

ValueCountFrequency (%) 
3867314.3%
 
e192597.1%
 
o185766.9%
 
a142045.3%
 
i139325.2%
 
t134615.0%
 
n132944.9%
 
r131914.9%
 
l72832.7%
 
s61522.3%
 
u57972.1%
 
d57302.1%
 
m50861.9%
 
c50441.9%
 
h48221.8%
 
g44361.6%
 
y41121.5%
 
B38641.4%
 
C36971.4%
 
p34961.3%
 
w33721.2%
 
S33271.2%
 
R31561.2%
 
L31431.2%
 
P29661.1%
 
Other values (314)5003618.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter17296064.0%
 
Uppercase Letter4346316.1%
 
Space Separator3872714.3%
 
Other Punctuation66032.4%
 
Decimal Number47701.8%
 
Dash Punctuation12880.5%
 
Math Symbol9990.4%
 
Other Symbol2840.1%
 
Other Letter2710.1%
 
Open Punctuation2620.1%
 
Close Punctuation2540.1%
 
Nonspacing Mark101< 0.1%
 
Final Punctuation60< 0.1%
 
Currency Symbol26< 0.1%
 
Control16< 0.1%
 
Initial Punctuation11< 0.1%
 
Spacing Mark9< 0.1%
 
Format3< 0.1%
 
Modifier Symbol2< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B38648.9%
 
C36978.5%
 
S33277.7%
 
R31567.3%
 
L31437.2%
 
P29666.8%
 
A25335.8%
 
E19824.6%
 
T19074.4%
 
H17844.1%
 
N16863.9%
 
O16563.8%
 
W16283.7%
 
M16213.7%
 
D15633.6%
 
G14323.3%
 
I12522.9%
 
F11872.7%
 
U9682.2%
 
V6951.6%
 
K4060.9%
 
Y3830.9%
 
Q2440.6%
 
Z1170.3%
 
X1170.3%
 
Other values (21)1490.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e1925911.1%
 
o1857610.7%
 
a142048.2%
 
i139328.1%
 
t134617.8%
 
n132947.7%
 
r131917.6%
 
l72834.2%
 
s61523.6%
 
u57973.4%
 
d57303.3%
 
m50862.9%
 
c50442.9%
 
h48222.8%
 
g44362.6%
 
y41122.4%
 
p34962.0%
 
w33721.9%
 
k28561.7%
 
f24521.4%
 
v22701.3%
 
b20741.2%
 
z7440.4%
 
x6240.4%
 
q4880.3%
 
Other values (41)2050.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
3867399.9%
 
 530.1%
 
 1< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-126798.4%
 
110.9%
 
100.8%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,180027.3%
 
/153023.2%
 
!101015.3%
 
&73111.1%
 
.67510.2%
 
'2033.1%
 
#1592.4%
 
*1512.3%
 
:1021.5%
 
691.0%
 
"600.9%
 
@310.5%
 
;270.4%
 
?110.2%
 
100.2%
 
90.1%
 
%60.1%
 
60.1%
 
50.1%
 
40.1%
 
2< 0.1%
 
\1< 0.1%
 
1< 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
2154932.5%
 
1118424.8%
 
369214.5%
 
03477.3%
 
43296.9%
 
52775.8%
 
61332.8%
 
9982.1%
 
8731.5%
 
7691.4%
 
50.1%
 
50.1%
 
50.1%
 
30.1%
 
1< 0.1%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(24493.1%
 
[72.7%
 
62.3%
 
31.1%
 
20.8%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)24194.9%
 
62.4%
 
41.6%
 
20.8%
 
]10.4%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
|52852.9%
 
+45045.0%
 
~131.3%
 
30.3%
 
10.1%
 
10.1%
 
=10.1%
 
<10.1%
 
>10.1%
 

Most frequent Other Symbol characters

ValueCountFrequency (%) 
5218.3%
 
4616.2%
 
3311.6%
 
155.3%
 
103.5%
 
🏠82.8%
 
72.5%
 
72.5%
 
62.1%
 
62.1%
 
💎51.8%
 
51.8%
 
51.8%
 
41.4%
 
41.4%
 
41.4%
 
41.4%
 
41.4%
 
41.4%
 
41.4%
 
31.1%
 
🥇31.1%
 
🌸31.1%
 
💗20.7%
 
20.7%
 
Other values (28)3813.4%
 

Most frequent Control characters

ValueCountFrequency (%) 
16100.0%
 

Most frequent Nonspacing Mark characters

ValueCountFrequency (%) 
8180.2%
 
98.9%
 
98.9%
 
ً22.0%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
`2100.0%
 

Most frequent Initial Punctuation characters

ValueCountFrequency (%) 
981.8%
 
«19.1%
 
19.1%
 

Most frequent Final Punctuation characters

ValueCountFrequency (%) 
5185.0%
 
813.3%
 
»11.7%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$26100.0%
 

Most frequent Other Letter characters

ValueCountFrequency (%) 
ه114.1%
 
ل114.1%
 
ا114.1%
 
114.1%
 
103.7%
 
أ103.7%
 
103.7%
 
93.3%
 
ب93.3%
 
ك93.3%
 
93.3%
 
93.3%
 
93.3%
 
93.3%
 
93.3%
 
51.8%
 
41.5%
 
41.5%
 
41.5%
 
31.1%
 
31.1%
 
31.1%
 
20.7%
 
20.7%
 
20.7%
 
Other values (71)9334.3%
 

Most frequent Spacing Mark characters

ValueCountFrequency (%) 
9100.0%
 

Most frequent Format characters

ValueCountFrequency (%) 
3100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin21638580.1%
 
Common5334319.7%
 
Han1530.1%
 
Inherited83< 0.1%
 
Devanagari72< 0.1%
 
Arabic63< 0.1%
 
Hangul10< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e192598.9%
 
o185768.6%
 
a142046.6%
 
i139326.4%
 
t134616.2%
 
n132946.1%
 
r131916.1%
 
l72833.4%
 
s61522.8%
 
u57972.7%
 
d57302.6%
 
m50862.4%
 
c50442.3%
 
h48222.2%
 
g44362.1%
 
y41121.9%
 
B38641.8%
 
C36971.7%
 
p34961.6%
 
w33721.6%
 
S33271.5%
 
R31561.5%
 
L31431.5%
 
P29661.4%
 
k28561.3%
 
Other values (63)3212914.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
3867372.5%
 
,18003.4%
 
215492.9%
 
/15302.9%
 
-12672.4%
 
111842.2%
 
!10101.9%
 
&7311.4%
 
36921.3%
 
.6751.3%
 
|5281.0%
 
+4500.8%
 
03470.7%
 
43290.6%
 
52770.5%
 
(2440.5%
 
)2410.5%
 
'2030.4%
 
#1590.3%
 
*1510.3%
 
61330.2%
 
:1020.2%
 
9980.2%
 
8730.1%
 
7690.1%
 
Other values (125)8281.6%
 

Most frequent Inherited characters

ValueCountFrequency (%) 
8197.6%
 
ً22.4%
 

Most frequent Han characters

ValueCountFrequency (%) 
117.2%
 
106.5%
 
106.5%
 
95.9%
 
53.3%
 
42.6%
 
42.6%
 
42.6%
 
32.0%
 
32.0%
 
32.0%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
Other values (53)5938.6%
 

Most frequent Arabic characters

ValueCountFrequency (%) 
ه1117.5%
 
ل1117.5%
 
ا1117.5%
 
أ1015.9%
 
ب914.3%
 
ك914.3%
 
و11.6%
 
س11.6%
 

Most frequent Devanagari characters

ValueCountFrequency (%) 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 

Most frequent Hangul characters

ValueCountFrequency (%) 
220.0%
 
220.0%
 
220.0%
 
220.0%
 
220.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII26890599.6%
 
None4320.2%
 
Punctuation1590.1%
 
CJK1530.1%
 
Misc Symbols97< 0.1%
 
Dingbats85< 0.1%
 
VS81< 0.1%
 
Devanagari72< 0.1%
 
Arabic65< 0.1%
 
Math Alphanum38< 0.1%
 
Hangul10< 0.1%
 
Block Elements6< 0.1%
 
Geometric Shapes6< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
3867314.4%
 
e192597.2%
 
o185766.9%
 
a142045.3%
 
i139325.2%
 
t134615.0%
 
n132944.9%
 
r131914.9%
 
l72832.7%
 
s61522.3%
 
u57972.2%
 
d57302.1%
 
m50861.9%
 
c50441.9%
 
h48221.8%
 
g44361.6%
 
y41121.5%
 
B38641.4%
 
C36971.4%
 
p34961.3%
 
w33721.3%
 
S33271.2%
 
R31561.2%
 
L31431.2%
 
P29661.1%
 
Other values (67)4883218.2%
 

Most frequent Misc Symbols characters

ValueCountFrequency (%) 
5253.6%
 
1010.3%
 
77.2%
 
55.2%
 
44.1%
 
44.1%
 
44.1%
 
33.1%
 
22.1%
 
11.0%
 
11.0%
 
11.0%
 
11.0%
 
11.0%
 
11.0%
 

Most frequent None characters

ValueCountFrequency (%) 
 5312.3%
 
337.6%
 
194.4%
 
153.5%
 
143.2%
 
133.0%
 
122.8%
 
112.5%
 
112.5%
 
ó102.3%
 
102.3%
 
102.3%
 
92.1%
 
92.1%
 
🏠81.9%
 
71.6%
 
61.4%
 
61.4%
 
61.4%
 
61.4%
 
61.4%
 
61.4%
 
💎51.2%
 
51.2%
 
51.2%
 
Other values (60)13731.7%
 

Most frequent Dingbats characters

ValueCountFrequency (%) 
4654.1%
 
78.2%
 
55.9%
 
44.7%
 
44.7%
 
44.7%
 
44.7%
 
22.4%
 
22.4%
 
22.4%
 
22.4%
 
22.4%
 
11.2%
 

Most frequent VS characters

ValueCountFrequency (%) 
81100.0%
 

Most frequent Punctuation characters

ValueCountFrequency (%) 
6943.4%
 
5132.1%
 
116.9%
 
106.3%
 
95.7%
 
85.0%
 
10.6%
 

Most frequent CJK characters

ValueCountFrequency (%) 
117.2%
 
106.5%
 
106.5%
 
95.9%
 
53.3%
 
42.6%
 
42.6%
 
42.6%
 
32.0%
 
32.0%
 
32.0%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
21.3%
 
Other values (53)5938.6%
 

Most frequent Arabic characters

ValueCountFrequency (%) 
ه1116.9%
 
ل1116.9%
 
ا1116.9%
 
أ1015.4%
 
ب913.8%
 
ك913.8%
 
ً23.1%
 
و11.5%
 
س11.5%
 

Most frequent Devanagari characters

ValueCountFrequency (%) 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 
912.5%
 

Most frequent Block Elements characters

ValueCountFrequency (%) 
6100.0%
 

Most frequent Hangul characters

ValueCountFrequency (%) 
220.0%
 
220.0%
 
220.0%
 
220.0%
 
220.0%
 

Most frequent Math Alphanum characters

ValueCountFrequency (%) 
𝗋37.9%
 
𝖺37.9%
 
𝗂37.9%
 
𝖻25.3%
 
𝗇25.3%
 
𝗍25.3%
 
𝗎25.3%
 
𝖽25.3%
 
𝗈25.3%
 
𝗄25.3%
 
𝖤25.3%
 
𝖴12.6%
 
𝖢12.6%
 
𝗁12.6%
 
𝖼12.6%
 
𝖲12.6%
 
𝖧12.6%
 
𝗆12.6%
 
𝗅12.6%
 
𝖯12.6%
 
𝖥12.6%
 
𝖱12.6%
 
𝗉12.6%
 
𝗀12.6%
 

Most frequent Geometric Shapes characters

ValueCountFrequency (%) 
6100.0%
 

host_id
Real number (ℝ≥0)

Distinct: 3553
Distinct (%): 54.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106119284.1
Minimum: 2140
Maximum: 380761555
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 2140
5-th percentile: 1523290
Q1: 18607228.5
Median: 63226994
Q3: 170785489
95-th percentile: 337787048.2
Maximum: 380761555
Range: 380759415
Interquartile range (IQR): 152178260.5

Descriptive statistics

Standard deviation: 106308403.9
Coefficient of variation (CV): 1.001782144
Kurtosis: -0.1264015088
Mean: 106119284.1
Median Absolute Deviation (MAD): 54894837
Skewness: 0.9918141307
Sum: 6.922160899e+11
Variance: 1.130147675e+16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1074344232163.3%
 
3965428741.1%
 
47172572631.0%
 
12243051500.8%
 
359234447490.8%
 
170785489470.7%
 
8534462400.6%
 
9094538400.6%
 
166918192350.5%
 
49626033310.5%
 
63313003300.5%
 
148973907300.5%
 
100782278290.4%
 
33127842260.4%
 
217094024250.4%
 
57387860250.4%
 
371036931240.4%
 
98193524200.3%
 
244000490200.3%
 
683529200.3%
 
178710732190.3%
 
257464365190.3%
 
35781467180.3%
 
2907254180.3%
 
154630260180.3%
 
Other values (3528)553784.9%
 
ValueCountFrequency (%) 
21402< 0.1%
 
215360.1%
 
26131< 0.1%
 
443450.1%
 
57751< 0.1%
 
61621< 0.1%
 
93011< 0.1%
 
112781< 0.1%
 
130141< 0.1%
 
179281< 0.1%
 
ValueCountFrequency (%) 
3807615551< 0.1%
 
3804370011< 0.1%
 
3803533931< 0.1%
 
3793123682< 0.1%
 
3790391291< 0.1%
 
3786667531< 0.1%
 
3782453941< 0.1%
 
3781423751< 0.1%
 
3779958442< 0.1%
 
3779806841< 0.1%
 

host_name
Categorical

HIGH CARDINALITY

Distinct: 1902
Distinct (%): 29.2%
Missing: 0
Missing (%): 0.0%
Memory size: 51.1 KiB

Value                Count  Frequency (%)
Blueground           216    3.3%
Rob                  82     1.3%
Zencity              63     1.0%
Joe                  63     1.0%
John                 61     0.9%
Michael              60     0.9%
Nicole               58     0.9%
Kia                  53     0.8%
Sonder               50     0.8%
David                47     0.7%
Dmd                  47     0.7%
Alex                 44     0.7%
Helen                41     0.6%
Barsala              40     0.6%
Brad & Sara          35     0.5%
William              34     0.5%
Dan                  34     0.5%
Matthew              33     0.5%
Roma                 31     0.5%
Mia & Noah           30     0.5%
K                    30     0.5%
Matt                 29     0.4%
Emily & Rich         29     0.4%
Kari                 28     0.4%
Kevin                28     0.4%
Other values (1877)  5257   80.6%
 
Frequencies of value counts

Unique

Unique: 1071
Unique (%): 16.4%
Histogram of lengths of the category

Length

Max length: 35
Median length: 6
Mean length: 6.478460831
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 79
Unique unicode categories: 11
Unique unicode scripts: 4
Unique unicode blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a472711.2%
 
e38839.2%
 
n34448.1%
 
i29877.1%
 
r24935.9%
 
o22735.4%
 
l21485.1%
 
t15413.6%
 
13163.1%
 
d12112.9%
 
h12042.8%
 
s11262.7%
 
u10312.4%
 
y10182.4%
 
c8532.0%
 
A8021.9%
 
J7881.9%
 
m7001.7%
 
M6181.5%
 
S5541.3%
 
g5451.3%
 
B5431.3%
 
R5201.2%
 
K4421.0%
 
D4391.0%
 
Other values (54)505312.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter3287277.8%
 
Uppercase Letter763218.1%
 
Space Separator13163.1%
 
Other Punctuation3650.9%
 
Math Symbol390.1%
 
Open Punctuation10< 0.1%
 
Close Punctuation10< 0.1%
 
Dash Punctuation9< 0.1%
 
Decimal Number3< 0.1%
 
Other Letter2< 0.1%
 
Modifier Symbol1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A80210.5%
 
J78810.3%
 
M6188.1%
 
S5547.3%
 
B5437.1%
 
R5206.8%
 
K4425.8%
 
D4395.8%
 
C4345.7%
 
T2983.9%
 
L2913.8%
 
N2743.6%
 
E2723.6%
 
H2313.0%
 
P1932.5%
 
G1822.4%
 
F1592.1%
 
W1381.8%
 
Z1101.4%
 
V1071.4%
 
I961.3%
 
O700.9%
 
Y480.6%
 
X100.1%
 
U70.1%
 
Other values (2)60.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a472714.4%
 
e388311.8%
 
n344410.5%
 
i29879.1%
 
r24937.6%
 
o22736.9%
 
l21486.5%
 
t15414.7%
 
d12113.7%
 
h12043.7%
 
s11263.4%
 
u10313.1%
 
y10183.1%
 
c8532.6%
 
m7002.1%
 
g5451.7%
 
b3681.1%
 
k2630.8%
 
v2350.7%
 
f1790.5%
 
w1410.4%
 
z1310.4%
 
p1260.4%
 
x1120.3%
 
j850.3%
 
Other values (13)480.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1316100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&31686.6%
 
.3910.7%
 
/41.1%
 
'30.8%
 
,30.8%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
+39100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-9100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(10100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)10100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
93100.0%
 

Most frequent Other Letter characters

ValueCountFrequency (%) 
姿150.0%
 
150.0%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
`1100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin4050095.8%
 
Common17534.1%
 
Cyrillic4< 0.1%
 
Han2< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a472711.7%
 
e38839.6%
 
n34448.5%
 
i29877.4%
 
r24936.2%
 
o22735.6%
 
l21485.3%
 
t15413.8%
 
d12113.0%
 
h12043.0%
 
s11262.8%
 
u10312.5%
 
y10182.5%
 
c8532.1%
 
A8022.0%
 
J7881.9%
 
m7001.7%
 
M6181.5%
 
S5541.4%
 
g5451.3%
 
B5431.3%
 
R5201.3%
 
K4421.1%
 
D4391.1%
 
C4341.1%
 
Other values (36)417610.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
131675.1%
 
&31618.0%
 
.392.2%
 
+392.2%
 
(100.6%
 
)100.6%
 
-90.5%
 
/40.2%
 
'30.2%
 
,30.2%
 
930.2%
 
`10.1%
 

Most frequent Han characters

ValueCountFrequency (%) 
姿150.0%
 
150.0%
 

Most frequent Cyrillic characters

ValueCountFrequency (%) 
Ю125.0%
 
р125.0%
 
и125.0%
 
й125.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4223299.9%
 
None20< 0.1%
 
Cyrillic4< 0.1%
 
CJK2< 0.1%
 
Latin Ext Additional1< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a472711.2%
 
e38839.2%
 
n34448.2%
 
i29877.1%
 
r24935.9%
 
o22735.4%
 
l21485.1%
 
t15413.6%
 
13163.1%
 
d12112.9%
 
h12042.9%
 
s11262.7%
 
u10312.4%
 
y10182.4%
 
c8532.0%
 
A8021.9%
 
J7881.9%
 
m7001.7%
 
M6181.5%
 
S5541.3%
 
g5451.3%
 
B5431.3%
 
R5201.2%
 
K4421.0%
 
D4391.0%
 
Other values (39)502611.9%
 

Most frequent None characters

ValueCountFrequency (%) 
è1050.0%
 
é420.0%
 
ñ15.0%
 
ư15.0%
 
ơ15.0%
 
ï15.0%
 
ö15.0%
 
á15.0%
 

Most frequent Latin Ext Additional characters

ValueCountFrequency (%) 
1100.0%
 

Most frequent CJK characters

ValueCountFrequency (%) 
姿150.0%
 
150.0%
 

Most frequent Cyrillic characters

ValueCountFrequency (%) 
Ю125.0%
 
р125.0%
 
и125.0%
 
й125.0%
 

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 77
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 51.1 KiB

Value               Count  Frequency (%)
Near North Side     748    11.5%
West Town           730    11.2%
Lake View           581    8.9%
Logan Square        382    5.9%
Loop                344    5.3%
Near West Side      337    5.2%
Lincoln Park        313    4.8%
Lower West Side     190    2.9%
Uptown              182    2.8%
Edgewater           166    2.5%
Irving Park         158    2.4%
Avondale            140    2.1%
Near South Side     135    2.1%
North Center        127    1.9%
Rogers Park         120    1.8%
Bridgeport          115    1.8%
Grand Boulevard     112    1.7%
Hyde Park           104    1.6%
East Garfield Park  103    1.6%
Lincoln Square      92     1.4%
Woodlawn            91     1.4%
South Shore         84     1.3%
West Ridge          81     1.2%
Portage Park        79     1.2%
Armour Square       77     1.2%
Other values (52)   932    14.3%
 
Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 22
Median length: 11
Mean length: 11.13552047
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 46
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e802111.0%
 
67649.3%
 
r60108.3%
 
a55167.6%
 
o53417.4%
 
t37205.1%
 
n34954.8%
 
i32264.4%
 
d28033.9%
 
S23333.2%
 
N21963.0%
 
w21483.0%
 
L19912.7%
 
s19852.7%
 
k18522.5%
 
W15082.1%
 
h14802.0%
 
g14422.0%
 
l13301.8%
 
P13191.8%
 
u13151.8%
 
T7301.0%
 
p6410.9%
 
V5810.8%
 
q5510.8%
 
Other values (21)43396.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter5258672.4%
 
Uppercase Letter1328718.3%
 
Space Separator67649.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S233317.6%
 
N219616.5%
 
L199115.0%
 
W150811.3%
 
P13199.9%
 
T7305.5%
 
V5814.4%
 
G3642.7%
 
A3532.7%
 
E2962.2%
 
B2842.1%
 
C2812.1%
 
H2291.7%
 
R2231.7%
 
U1821.4%
 
I1581.2%
 
D970.7%
 
K440.3%
 
M410.3%
 
J380.3%
 
O220.2%
 
F170.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e802115.3%
 
r601011.4%
 
a551610.5%
 
o534110.2%
 
t37207.1%
 
n34956.6%
 
i32266.1%
 
d28035.3%
 
w21484.1%
 
s19853.8%
 
k18523.5%
 
h14802.8%
 
g14422.7%
 
l13302.5%
 
u13152.5%
 
p6411.2%
 
q5511.0%
 
c4760.9%
 
v4200.8%
 
m2430.5%
 
y2100.4%
 
f2080.4%
 
b1530.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
6764100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin6587390.7%
 
Common67649.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e802112.2%
 
r60109.1%
 
a55168.4%
 
o53418.1%
 
t37205.6%
 
n34955.3%
 
i32264.9%
 
d28034.3%
 
S23333.5%
 
N21963.3%
 
w21483.3%
 
L19913.0%
 
s19853.0%
 
k18522.8%
 
W15082.3%
 
h14802.2%
 
g14422.2%
 
l13302.0%
 
P13192.0%
 
u13152.0%
 
T7301.1%
 
p6411.0%
 
V5810.9%
 
q5510.8%
 
c4760.7%
 
Other values (20)38635.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
6764100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII72637100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e802111.0%
 
67649.3%
 
r60108.3%
 
a55167.6%
 
o53417.4%
 
t37205.1%
 
n34954.8%
 
i32264.4%
 
d28033.9%
 
S23333.2%
 
N21963.0%
 
w21483.0%
 
L19912.7%
 
s19852.7%
 
k18522.5%
 
W15082.1%
 
h14802.0%
 
g14422.0%
 
l13301.8%
 
P13191.8%
 
u13151.8%
 
T7301.0%
 
p6410.9%
 
V5810.8%
 
q5510.8%
 
Other values (21)43396.0%
 

latitude
Real number (ℝ≥0)

Distinct: 5168
Distinct (%): 79.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.89871965
Minimum: 41.65156
Maximum: 42.02259
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 41.65156
5-th percentile: 41.782255
Q1: 41.87348
Median: 41.90143
Q3: 41.939765
95-th percentile: 41.987144
Maximum: 42.02259
Range: 0.37103
Interquartile range (IQR): 0.066285

Descriptive statistics

Standard deviation: 0.05904695304
Coefficient of variation (CV): 0.00140927822
Kurtosis: 0.815250805
Mean: 41.89871965
Median Absolute Deviation (MAD): 0.03475
Skewness: -0.7368105194
Sum: 273305.3483
Variance: 0.003486542663
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
41.88306350.5%
 
41.88608310.5%
 
41.89111300.5%
 
42.01653180.3%
 
41.88558160.2%
 
41.89063130.2%
 
41.88302120.2%
 
41.8989120.2%
 
41.89622120.2%
 
41.89862110.2%
 
41.89235110.2%
 
41.88606110.2%
 
41.94041110.2%
 
41.88309100.2%
 
41.8945390.1%
 
41.9032380.1%
 
41.8950280.1%
 
41.9089580.1%
 
41.8771170.1%
 
41.8772370.1%
 
41.9065370.1%
 
41.8962170.1%
 
41.8990270.1%
 
41.8998860.1%
 
41.9221460.1%
 
Other values (5143)621095.2%
 
ValueCountFrequency (%) 
41.651561< 0.1%
 
41.652411< 0.1%
 
41.653011< 0.1%
 
41.653671< 0.1%
 
41.656481< 0.1%
 
41.665191< 0.1%
 
41.685821< 0.1%
 
41.686051< 0.1%
 
41.686121< 0.1%
 
41.687931< 0.1%
 
ValueCountFrequency (%) 
42.022591< 0.1%
 
42.022111< 0.1%
 
42.021971< 0.1%
 
42.021581< 0.1%
 
42.021391< 0.1%
 
42.021321< 0.1%
 
42.020921< 0.1%
 
42.020741< 0.1%
 
42.020731< 0.1%
 
42.020261< 0.1%
 

longitude
Real number (ℝ)

Distinct: 4981
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -87.66339754
Minimum: -87.93434
Maximum: -87.53782
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: -87.93434
5-th percentile: -87.735849
Q1: -87.68666
Median: -87.65959
Q3: -87.632985
95-th percentile: -87.604728
Maximum: -87.53782
Range: 0.39652
Interquartile range (IQR): 0.053675

Descriptive statistics

Standard deviation: 0.04238696923
Coefficient of variation (CV): -0.0004835195808
Kurtosis: 1.407428092
Mean: -87.66339754
Median Absolute Deviation (MAD): 0.02671
Skewness: -0.7005122763
Sum: -571828.3422
Variance: 0.001796655161
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
-87.63422310.5%
 
-87.62205300.5%
 
-87.65131300.5%
 
-87.6257160.2%
 
-87.61903130.2%
 
-87.68778120.2%
 
-87.62472120.2%
 
-87.62571110.2%
 
-87.62832100.2%
 
-87.62138100.2%
 
-87.62732100.2%
 
-87.6334190.1%
 
-87.6279790.1%
 
-87.6338580.1%
 
-87.709680.1%
 
-87.6290170.1%
 
-87.641570.1%
 
-87.6504870.1%
 
-87.6278570.1%
 
-87.6279170.1%
 
-87.6257760.1%
 
-87.6677760.1%
 
-87.6300960.1%
 
-87.652560.1%
 
-87.6737160.1%
 
Other values (4956)623995.6%
 
ValueCountFrequency (%) 
-87.934341< 0.1%
 
-87.846741< 0.1%
 
-87.845461< 0.1%
 
-87.84431< 0.1%
 
-87.843211< 0.1%
 
-87.841931< 0.1%
 
-87.841321< 0.1%
 
-87.836991< 0.1%
 
-87.835611< 0.1%
 
-87.835281< 0.1%
 
ValueCountFrequency (%) 
-87.537821< 0.1%
 
-87.539421< 0.1%
 
-87.541651< 0.1%
 
-87.544231< 0.1%
 
-87.544241< 0.1%
 
-87.545421< 0.1%
 
-87.545591< 0.1%
 
-87.545611< 0.1%
 
-87.545711< 0.1%
 
-87.545971< 0.1%
 

room_type
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 51.1 KiB

Value            Count  Frequency (%)
Entire home/apt  4510   69.1%
Private room     1848   28.3%
Shared room      94     1.4%
Hotel room       71     1.1%
Hotel room711.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 15
Median length: 15
Mean length: 14.03801932
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 19
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e1103312.0%
 
t1093911.9%
 
o86079.4%
 
r84659.2%
 
65237.1%
 
m65237.1%
 
a64527.0%
 
i63586.9%
 
h46045.0%
 
E45104.9%
 
n45104.9%
 
/45104.9%
 
p45104.9%
 
P18482.0%
 
v18482.0%
 
S940.1%
 
d940.1%
 
H710.1%
 
l710.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter7401480.8%
 
Uppercase Letter65237.1%
 
Space Separator65237.1%
 
Other Punctuation45104.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
E451069.1%
 
P184828.3%
 
S941.4%
 
H711.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e1103314.9%
 
t1093914.8%
 
o860711.6%
 
r846511.4%
 
m65238.8%
 
a64528.7%
 
i63588.6%
 
h46046.2%
 
n45106.1%
 
p45106.1%
 
v18482.5%
 
d940.1%
 
l710.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
6523100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/4510100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin8053788.0%
 
Common1103312.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e1103313.7%
 
t1093913.6%
 
o860710.7%
 
r846510.5%
 
m65238.1%
 
a64528.0%
 
i63587.9%
 
h46045.7%
 
E45105.6%
 
n45105.6%
 
p45105.6%
 
P18482.3%
 
v18482.3%
 
S940.1%
 
d940.1%
 
H710.1%
 
l710.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
652359.1%
 
/451040.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII91570100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e1103312.0%
 
t1093911.9%
 
o86079.4%
 
r84659.2%
 
65237.1%
 
m65237.1%
 
a64527.0%
 
i63586.9%
 
h46045.0%
 
E45104.9%
 
n45104.9%
 
/45104.9%
 
p45104.9%
 
P18482.0%
 
v18482.0%
 
S940.1%
 
d940.1%
 
H710.1%
 
l710.1%
 

price
Real number (ℝ≥0)

SKEWED

Distinct: 504
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.062088
Minimum: 0
Maximum: 10000
Zeros: 5
Zeros (%): 0.1%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 60
Median: 94
Q3: 150
95-th percentile: 400
Maximum: 10000
Range: 10000
Interquartile range (IQR): 90

Descriptive statistics

Standard deviation: 371.5814529
Coefficient of variation (CV): 2.476184743
Kurtosis: 507.6916875
Mean: 150.062088
Median Absolute Deviation (MAD): 42
Skewness: 20.25399533
Sum: 978855
Variance: 138072.7761
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
75                  167    2.6%
50                  144    2.2%
100                 138    2.1%
65                  119    1.8%
60                  114    1.7%
80                  114    1.7%
70                  110    1.7%
150                 110    1.7%
45                  107    1.6%
85                  100    1.5%
99                  96     1.5%
200                 91     1.4%
35                  88     1.3%
90                  86     1.3%
55                  86     1.3%
125                 82     1.3%
95                  81     1.2%
120                 74     1.1%
110                 70     1.1%
89                  68     1.0%
49                  67     1.0%
40                  66     1.0%
30                  65     1.0%
400                 65     1.0%
59                  64     1.0%
Other values (479)  4151   63.6%

Value               Count  Frequency (%)
0                   5      0.1%
10                  3      < 0.1%
12                  1      < 0.1%
14                  4      0.1%
15                  8      0.1%
16                  6      0.1%
17                  9      0.1%
18                  5      0.1%
19                  7      0.1%
20                  14     0.2%

Value               Count  Frequency (%)
10000               1      < 0.1%
9999                5      0.1%
9000                1      < 0.1%
3690                1      < 0.1%
3500                1      < 0.1%
3429                1      < 0.1%
3070                1      < 0.1%
3000                1      < 0.1%
2773                1      < 0.1%
2507                1      < 0.1%
 

minimum_nights
Real number (ℝ≥0)

Distinct: 61
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.231488579
Minimum: 1
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 32
Maximum: 500
Range: 499
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 22.38369529
Coefficient of variation (CV): 2.719276723
Kurtosis: 161.0444242
Mean: 8.231488579
Median Absolute Deviation (MAD): 1
Skewness: 10.65131132
Sum: 53694
Variance: 501.0298148
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
2212132.5%
 
1200030.7%
 
376511.7%
 
303986.1%
 
41752.7%
 
71652.5%
 
311412.2%
 
321352.1%
 
51201.8%
 
14681.0%
 
33600.9%
 
60510.8%
 
10510.8%
 
28470.7%
 
6310.5%
 
90190.3%
 
20180.3%
 
15170.3%
 
41160.2%
 
21120.2%
 
365110.2%
 
8100.2%
 
2990.1%
 
2580.1%
 
2770.1%
 
Other values (36)681.0%
 
ValueCountFrequency (%) 
1200030.7%
 
2212132.5%
 
376511.7%
 
41752.7%
 
51201.8%
 
6310.5%
 
71652.5%
 
8100.2%
 
93< 0.1%
 
10510.8%
 
ValueCountFrequency (%) 
5001< 0.1%
 
365110.2%
 
3601< 0.1%
 
2101< 0.1%
 
2001< 0.1%
 
1851< 0.1%
 
1821< 0.1%
 
18070.1%
 
1791< 0.1%
 
1681< 0.1%
 

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 338
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.67162349
Minimum: 0
Maximum: 655
Zeros: 1285
Zeros (%): 19.7%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 13
Q3: 53
95-th percentile: 176.9
Maximum: 655
Range: 655
Interquartile range (IQR): 52

Descriptive statistics

Standard deviation: 67.2569877
Coefficient of variation (CV): 1.6139757
Kurtosis: 12.65643683
Mean: 41.67162349
Median Absolute Deviation (MAD): 13
Skewness: 2.988450645
Sum: 271824
Variance: 4523.502395
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
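Value | Count | Frequency (%)
0 | 1285 | 19.7%
1 | 467 | 7.2%
2 | 276 | 4.2%
3 | 236 | 3.6%
4 | 178 | 2.7%
5 | 141 | 2.2%
6 | 127 | 1.9%
7 | 106 | 1.6%
8 | 100 | 1.5%
9 | 84 | 1.3%
13 | 83 | 1.3%
10 | 83 | 1.3%
12 | 78 | 1.2%
16 | 70 | 1.1%
11 | 63 | 1.0%
24 | 58 | 0.9%
20 | 58 | 0.9%
18 | 57 | 0.9%
15 | 57 | 0.9%
14 | 56 | 0.9%
25 | 56 | 0.9%
19 | 55 | 0.8%
21 | 54 | 0.8%
23 | 53 | 0.8%
17 | 53 | 0.8%
Other values (313) | 2589 | 39.7%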
 
Value | Count | Frequency (%)
0 | 1285 | 19.7%
1 | 467 | 7.2%
2 | 276 | 4.2%
3 | 236 | 3.6%
4 | 178 | 2.7%
5 | 141 | 2.2%
6 | 127 | 1.9%
7 | 106 | 1.6%
8 | 100 | 1.5%
9 | 84 | 1.3%
 
Value | Count | Frequency (%)
655 | 1 | < 0.1%
641 | 1 | < 0.1%
626 | 1 | < 0.1%
570 | 1 | < 0.1%
541 | 1 | < 0.1%
524 | 1 | < 0.1%
518 | 1 | < 0.1%
513 | 1 | < 0.1%
508 | 1 | < 0.1%
505 | 1 | < 0.1%
 

last_review
Categorical

HIGH CARDINALITY
MISSING

Distinct: 820
Distinct (%): 15.7%
Missing: 1285
Missing (%): 19.7%
Memory size: 51.1 KiB
Value | Count
11/29/20 | 159
12/13/20 | 117
3/15/20 | 110
11/30/20 | 95
2/16/20 | 95
Other values (815) | 4662

Value | Count | Frequency (%)
11/29/20 | 159 | 2.4%
12/13/20 | 117 | 1.8%
3/15/20 | 110 | 1.7%
11/30/20 | 95 | 1.5%
2/16/20 | 95 | 1.5%
11/28/20 | 93 | 1.4%
12/6/20 | 80 | 1.2%
11/15/20 | 68 | 1.0%
11/8/20 | 67 | 1.0%
10/25/20 | 63 | 1.0%
11/22/20 | 62 | 1.0%
12/14/20 | 59 | 0.9%
12/5/20 | 51 | 0.8%
10/18/20 | 49 | 0.8%
10/20/19 | 48 | 0.7%
12/7/20 | 45 | 0.7%
12/15/20 | 45 | 0.7%
11/27/20 | 44 | 0.7%
2/17/20 | 44 | 0.7%
10/31/20 | 43 | 0.7%
11/23/20 | 43 | 0.7%
3/16/20 | 43 | 0.7%
10/11/20 | 43 | 0.7%
12/1/20 | 41 | 0.6%
11/21/20 | 40 | 0.6%
Other values (795) | 3591 | 55.1%
(Missing) | 1285 | 19.7%
 
Frequencies of value counts

Unique

Unique: 340
Unique (%): 6.5%
Histogram of lengths of the category

Length

Max length: 8
Median length: 7
Mean length: 6.463437069
Min length: 3
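
A minimum length of 3 is shorter than any m/d/yy date, which suggests that the 1285 missing entries were measured as the string "nan"; the character counts further down (n: 2570, a: 1285) are consistent with that reading. The sketch below shows one way to turn these strings into real dates before further analysis. The function name parse_last_review is hypothetical, and the two-digit-year format is inferred from the values listed in this report.

```python
from datetime import date, datetime


def parse_last_review(raw):
    """Parse an m/d/yy string into a date; return None for missing values."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Missing values may arrive as NaN floats or as the literal string "nan".
    if text == "" or text.lower() == "nan":
        return None
    # %m and %d accept values without a leading zero, e.g. "3/15/20".
    return datetime.strptime(text, "%m/%d/%y").date()


most_common = parse_last_review("11/29/20")
missing = parse_last_review(float("nan"))
print(most_common, missing)
```

Once parsed, the column can be profiled as a date rather than as a high-cardinality categorical.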

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
/ | 10476 | 24.8%
1 | 8261 | 19.6%
2 | 7212 | 17.1%
0 | 5425 | 12.9%
n | 2570 | 6.1%
9 | 1782 | 4.2%
3 | 1292 | 3.1%
a | 1285 | 3.0%
8 | 1008 | 2.4%
7 | 795 | 1.9%
6 | 766 | 1.8%
5 | 725 | 1.7%
4 | 564 | 1.3%
 

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 27830 | 66.0%
Other Punctuation | 10476 | 24.8%
Lowercase Letter | 3855 | 9.1%
 

Most frequent Decimal Number characters

Value | Count | Frequency (%)
1 | 8261 | 29.7%
2 | 7212 | 25.9%
0 | 5425 | 19.5%
9 | 1782 | 6.4%
3 | 1292 | 4.6%
8 | 1008 | 3.6%
7 | 795 | 2.9%
6 | 766 | 2.8%
5 | 725 | 2.6%
4 | 564 | 2.0%
 

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
/ | 10476 | 100.0%
 

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 2570 | 66.7%
a | 1285 | 33.3%
 

Most occurring scripts

Value | Count | Frequency (%)
Common | 38306 | 90.9%
Latin | 3855 | 9.1%
 

Most frequent Common characters

Value | Count | Frequency (%)
/ | 10476 | 27.3%
1 | 8261 | 21.6%
2 | 7212 | 18.8%
0 | 5425 | 14.2%
9 | 1782 | 4.7%
3 | 1292 | 3.4%
8 | 1008 | 2.6%
7 | 795 | 2.1%
6 | 766 | 2.0%
5 | 725 | 1.9%
4 | 564 | 1.5%
 

Most frequent Latin characters

Value | Count | Frequency (%)
n | 2570 | 66.7%
a | 1285 | 33.3%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 42161 | 100.0%
 

Most frequent ASCII characters

Value | Count | Frequency (%)
/ | 10476 | 24.8%
1 | 8261 | 19.6%
2 | 7212 | 17.1%
0 | 5425 | 12.9%
n | 2570 | 6.1%
9 | 1782 | 4.2%
3 | 1292 | 3.1%
a | 1285 | 3.0%
8 | 1008 | 2.4%
7 | 795 | 1.9%
6 | 766 | 1.8%
5 | 725 | 1.7%
4 | 564 | 1.3%
 

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 644
Distinct (%): 12.3%
Missing: 1285
Missing (%): 19.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.65593929
Minimum: 0.01
Maximum: 32.41
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.08
Q1: 0.39
median: 1.12
Q3: 2.45
95-th percentile: 4.7815
Maximum: 32.41
Range: 32.4
Interquartile range (IQR): 2.06

Descriptive statistics

Standard deviation: 1.727131468
Coefficient of variation (CV): 1.042992022
Kurtosis: 31.03968472
Mean: 1.65593929
Median Absolute Deviation (MAD): 0.87
Skewness: 3.231401101
Sum: 8673.81
Variance: 2.982983106
Monotonicity: Not monotonic
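
The 1285 missing values here equal the number of zeros reported for number_of_reviews, which suggests (but does not prove) that reviews_per_month is empty exactly when a listing has no reviews. Under that assumption, one possible cleaning step is to fill those gaps with 0.0 while leaving any other missing value untouched. The sketch below uses hypothetical rows and a hypothetical helper name; it is not part of the report's pipeline.

```python
import math


def fill_reviews_per_month(rows):
    """Return copies of rows with a missing reviews_per_month set to 0.0
    when number_of_reviews is 0; other rows are copied unchanged."""
    cleaned = []
    for row in rows:
        fixed = dict(row)
        rpm = fixed.get("reviews_per_month")
        is_missing = rpm is None or (isinstance(rpm, float) and math.isnan(rpm))
        if is_missing and fixed.get("number_of_reviews") == 0:
            fixed["reviews_per_month"] = 0.0
        cleaned.append(fixed)
    return cleaned


# Hypothetical rows for illustration only, not taken from the dataset.
rows = [
    {"number_of_reviews": 0, "reviews_per_month": float("nan")},
    {"number_of_reviews": 12, "reviews_per_month": 0.4},
    {"number_of_reviews": 3, "reviews_per_month": None},
]
cleaned = fill_reviews_per_month(rows)
print(cleaned)
```

The third row stays missing on purpose: a listing with reviews but no rate points to a different data problem and should not be silently zeroed.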
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 111 | 1.7%
0.1 | 55 | 0.8%
0.06 | 52 | 0.8%
0.11 | 51 | 0.8%
0.19 | 51 | 0.8%
0.07 | 50 | 0.8%
0.08 | 49 | 0.8%
0.16 | 49 | 0.8%
0.21 | 47 | 0.7%
0.03 | 44 | 0.7%
0.13 | 44 | 0.7%
0.17 | 44 | 0.7%
0.09 | 43 | 0.7%
0.05 | 38 | 0.6%
0.14 | 38 | 0.6%
0.29 | 37 | 0.6%
0.12 | 37 | 0.6%
0.2 | 35 | 0.5%
0.31 | 34 | 0.5%
0.38 | 32 | 0.5%
0.18 | 32 | 0.5%
0.25 | 32 | 0.5%
0.28 | 31 | 0.5%
2 | 30 | 0.5%
0.04 | 29 | 0.4%
Other values (619) | 4143 | 63.5%
(Missing) | 1285 | 19.7%
 
Value | Count | Frequency (%)
0.01 | 1 | < 0.1%
0.02 | 20 | 0.3%
0.03 | 44 | 0.7%
0.04 | 29 | 0.4%
0.05 | 38 | 0.6%
0.06 | 52 | 0.8%
0.07 | 50 | 0.8%
0.08 | 49 | 0.8%
0.09 | 43 | 0.7%
0.1 | 55 | 0.8%
 
Value | Count | Frequency (%)
32.41 | 1 | < 0.1%
24.34 | 1 | < 0.1%
19.56 | 1 | < 0.1%
17.16 | 1 | < 0.1%
13.86 | 1 | < 0.1%
13.74 | 1 | < 0.1%
12.08 | 1 | < 0.1%
11.35 | 1 | < 0.1%
11.03 | 1 | < 0.1%
11 | 1 | < 0.1%
 

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 34
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.44718688
Minimum: 1
Maximum: 216
Zeros: 0
Zeros (%): 0.0%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 8
95-th percentile: 63
Maximum: 216
Range: 215
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 39.62176771
Coefficient of variation (CV): 2.742524759
Kurtosis: 19.29870307
Mean: 14.44718688
Median Absolute Deviation (MAD): 1
Skewness: 4.413201886
Sum: 94239
Variance: 1569.884477
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
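Value | Count | Frequency (%)
1 | 2708 | 41.5%
2 | 860 | 13.2%
3 | 441 | 6.8%
4 | 376 | 5.8%
5 | 235 | 3.6%
216 | 216 | 3.3%
6 | 174 | 2.7%
12 | 120 | 1.8%
9 | 117 | 1.8%
8 | 96 | 1.5%
7 | 91 | 1.4%
40 | 80 | 1.2%
74 | 74 | 1.1%
10 | 70 | 1.1%
11 | 66 | 1.0%
63 | 63 | 1.0%
15 | 60 | 0.9%
20 | 60 | 0.9%
30 | 60 | 0.9%
18 | 54 | 0.8%
25 | 50 | 0.8%
50 | 50 | 0.8%
49 | 49 | 0.8%
16 | 48 | 0.7%
47 | 47 | 0.7%
Other values (9) | 258 | 4.0%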
 
Value | Count | Frequency (%)
1 | 2708 | 41.5%
2 | 860 | 13.2%
3 | 441 | 6.8%
4 | 376 | 5.8%
5 | 235 | 3.6%
6 | 174 | 2.7%
7 | 91 | 1.4%
8 | 96 | 1.5%
9 | 117 | 1.8%
10 | 70 | 1.1%
 
Value | Count | Frequency (%)
216 | 216 | 3.3%
74 | 74 | 1.1%
63 | 63 | 1.0%
50 | 50 | 0.8%
49 | 49 | 0.8%
47 | 47 | 0.7%
40 | 80 | 1.2%
35 | 35 | 0.5%
31 | 31 | 0.5%
30 | 60 | 0.9%
 

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 361
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.5874598
Minimum: 0
Maximum: 365
Zeros: 1797
Zeros (%): 27.5%
Memory size: 51.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 123
Q3: 333
95-th percentile: 365
Maximum: 365
Range: 365
Interquartile range (IQR): 333

Descriptive statistics

Standard deviation: 144.3194377
Coefficient of variation (CV): 0.8986968094
Kurtosis: -1.550568682
Mean: 160.5874598
Median Absolute Deviation (MAD): 123
Skewness: 0.271537519
Sum: 1047512
Variance: 20828.1001
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
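Value | Count | Frequency (%)
0 | 1797 | 27.5%
365 | 507 | 7.8%
90 | 164 | 2.5%
364 | 162 | 2.5%
180 | 126 | 1.9%
353 | 106 | 1.6%
89 | 101 | 1.5%
78 | 89 | 1.4%
363 | 82 | 1.3%
179 | 74 | 1.1%
88 | 67 | 1.0%
360 | 61 | 0.9%
168 | 50 | 0.8%
362 | 49 | 0.8%
358 | 46 | 0.7%
361 | 46 | 0.7%
322 | 44 | 0.7%
178 | 43 | 0.7%
354 | 38 | 0.6%
359 | 36 | 0.6%
352 | 35 | 0.5%
356 | 34 | 0.5%
351 | 33 | 0.5%
294 | 33 | 0.5%
83 | 33 | 0.5%
Other values (336) | 2667 | 40.9%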
 
Value | Count | Frequency (%)
0 | 1797 | 27.5%
1 | 27 | 0.4%
2 | 4 | 0.1%
3 | 6 | 0.1%
4 | 7 | 0.1%
5 | 6 | 0.1%
6 | 10 | 0.2%
7 | 9 | 0.1%
8 | 6 | 0.1%
9 | 4 | 0.1%
 
Value | Count | Frequency (%)
365 | 507 | 7.8%
364 | 162 | 2.5%
363 | 82 | 1.3%
362 | 49 | 0.8%
361 | 46 | 0.7%
360 | 61 | 0.9%
359 | 36 | 0.6%
358 | 46 | 0.7%
357 | 32 | 0.5%
356 | 34 | 0.5%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
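
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the report's own implementation; the function name pearson_r and the toy data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of the covariance and the two standard deviations
    # cancel, so unnormalised sums are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(x, y)
# Rescaling and shifting either variable (with positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r, r_rescaled)
```

The function raises ZeroDivisionError for a constant input, since a variable with zero variance has no defined correlation.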

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
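
As a sketch of this definition, the snippet below ranks each variable (ties share the average of their positions, which is a common convention and an assumption here) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and the toy data are illustrative only.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of X and Y."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged.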

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
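
The pair-counting rule can be written directly as below. This sketch implements the formula exactly as stated (often called tau-a); library implementations commonly use a tie-adjusted variant, so results may differ when the data contain ties. The function name and the sample values are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


# Hypothetical data with one swapped pair: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The quadratic loop over all pairs is fine for a sketch but slow for thousands of rows.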

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
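
For orientation, here is a sketch of the bias correction in its commonly cited form: the chi-squared statistic is divided by n, a term (k-1)(r-1)/(n-1) is subtracted, and the row and column counts are shrunk before taking the ratio. Whether the report computes the measure in exactly this way is an assumption; the function name and the example tables are illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Hypothetical 2x2 tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[10, 0], [0, 10]])
v_independent = cramers_v_corrected([[5, 5], [5, 5]])
print(v_perfect, v_independent)
```

Clamping the corrected phi-squared at zero keeps the result at 0 for tables whose association is no larger than what chance alone would produce.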

Missing values

Sample

First rows

idnamehost_idhost_nameneighbourhoodlatitudelongituderoom_typepriceminimum_nightsnumber_of_reviewslast_reviewreviews_per_monthcalculated_host_listings_countavailability_365
02384Hyde Park - Walk to UChicago, 10 min to McCormick2613RebeccaHyde Park41.78790-87.58780Private room70218110/29/202.5010
14505394 Great Reviews. 127 y/o House. 40 yds to train.5775Craig & KathleenSouth Lawndale41.85495-87.69696Entire home/apt9523957/14/202.751170
27126Tiny Studio Apartment 94 Walk Score17928SarahWest Town41.90289-87.68182Entire home/apt60238711/16/202.7710
39811Barbara's Hideaway - Old Town33004At Home InnLincoln Park41.91769-87.63788Entire home/apt6545311/30/200.6511276
4106103 Comforts of Cooperative Living2140LoisHyde Park41.79612-87.59261Private room201459/15/200.6020
510945The Biddle House (#1)33004At Home InnLincoln Park41.91183-87.64000Entire home/apt11642111/21/200.261183
612140Lincoln Park Guest House46734Sharon And RobertLincoln Park41.92335-87.64951Private room2892410/17/180.061179
722362Luxury in Chicago! 2BR/ 2Ba / Parking / BBQ85811CraigWest Town41.89617-87.66041Entire home/apt9991910/12/140.112365
824833Private Apt 1 Block to Fullerton L Red Line - Deck101521RedLincoln Park41.92679-87.65521Entire home/apt3432377/29/180.293111
925879Top 2/1 Block to Fullerton L Red Line Deck & Yard101521RedLincoln Park41.92693-87.65753Entire home/apt9432475/11/200.373152

Last rows

idnamehost_idhost_nameneighbourhoodlatitudelongituderoom_typepriceminimum_nightsnumber_of_reviewslast_reviewreviews_per_monthcalculated_host_listings_countavailability_365
651347115140Upscale & Cozy 2bd 1bt Apt Mins From Dwntwn Beach27619788CrystalWoodlawn41.776940-87.608800Entire home/apt6320NaNNaN1166
651447116894Height of Luxury! Airy & Elegant River North Home376283542AntoinetteNear North Side41.894940-87.635130Entire home/apt159910NaNNaN1353
651547118155New 3 bedroom 2 bathroom apartment in rogers park.379312368RafatWest Ridge41.997890-87.688800Entire home/apt7610NaNNaN2365
651647121422Upscale Southport Abode near Everything--2 bed6187354ErinLake View41.949880-87.667290Entire home/apt72140NaNNaN162
651747123944Chicago Loft342913217KirstenNorth Center41.937440-87.684830Entire home/apt180010NaNNaN162
651847126307The Humboldt Jungalow4657251NathanHumboldt Park41.904030-87.716110Entire home/apt18040NaNNaN178
651947126361Lovely Flat Close to Rush Hospital / United Centre380761555EkremNear West Side41.879120-87.681380Private room2110NaNNaN1362
652047137445Entire apartment in West Town with Rooftop189403517EugeneWest Town41.900750-87.664920Entire home/apt115100NaNNaN114
652147140245Vintage 3BR hideaway near UChicago Med, Sanitized100179KennethWoodlawn41.784033-87.610653Entire home/apt4620NaNNaN1071
652247141177Modern Room in a Chicago Bungalow6617501NnamdiSouth Shore41.756159-87.585177Private room2320NaNNaN1284